Method and apparatus for spatio-temporal subband video enhancement with small time delay

ABSTRACT

Video processing method and means for enhancing a video stream, by computing transform coefficients using a spatio-temporal transform comprising a spatial subband transform and a causal time wavelet transform performing filterings with multiscale causal wavelets, modifying the transform coefficients with a nonlinear processing, and computing a processed video stream from the modified transform coefficients using a spatio-temporal reconstruction transform comprising an inverse subband transform and a short delay inverse time wavelet transform, where the short delay inverse time wavelet transform is implemented with wavelet filters modified with window functions to control the processing delay of the entire video processing method.

BACKGROUND OF INVENTION

The present invention relates generally to a video enhancement method, apparatus and computer program, and in particular to a method, apparatus and computer program useful for enhancing the visual quality of videos.

The acquisition process of a video stream often introduces distortions and noise. Video cameras introduce electronic noise and blur due to imperfect optics. Other videos such as medical X-ray or infrared videos have other types of noise, and their resolution is limited by the acquisition process. In addition, distortions may also be introduced by digital video compression. For example, the MPEG-2 and MPEG-4 compression standards introduce block effects and mosquito noise that reduce the video quality. Transport of video over analog channels also adds noise to the video signal.

Video noise and compression artifacts can be attenuated with a linear filtering in space and time, but this process introduces blur along sharp transitions and fast moving objects. In most applications, the enhancement process of a video must introduce only a limited time delay. In television, medical and military applications, it may even be necessary to use causal procedures that only process past images in order to restore a video with nearly no delay. Recursive time filters are generally used for this purpose.

To reduce the blur introduced by linear filters, adaptive filtering techniques have been introduced. A parameter adjustment of the time recursive filters is incorporated in order to reduce the averaging when the scene is moving. This parameter adjustment can be incorporated in the more general framework of Kalman filtering. However, there is no sufficiently reliable model of video images that allows finding robust parameter adjustment procedures. As a result, the range of adaptivity is often small in order to avoid making large errors. Moreover, the parameter adjustment does not take into account the joint time and space image properties.

For an image, efficient adaptive noise removal algorithms are implemented with thresholding strategies applied to the output of a subband transform such as a wavelet transform, a wavelet packet transform or a bandlet transform. Thresholding subband images is equivalent to adaptively averaging the input image where there is no sharp transition. Blur removal can also be implemented with a sharpening which increases the amplitude of high frequency subband images, with parameters that depend upon the blur.

For videos, a spatio-temporal subband transform, with a combination of a spatial wavelet transform and a time wavelet transform, replaces the subband transform used for images. Non-linear operators such as thresholding operators are applied to the resulting spatio-temporal subband images, and an enhanced video image is reconstructed by combining an inverse time wavelet transform and an inverse spatial subband transform. Such algorithms adaptively remove the noise depending upon the local sharpness and motion of video structures. However, state of the art video processing methods use a combination of a time wavelet transform and an inverse time wavelet transform that introduces a time delay that is typically equal to the maximum time support of multiscale wavelets. To take advantage of time redundancy, this maximum time support must be sufficiently large, but this produces a large time delay. The resulting delay is often too large for real-time video enhancement applications, in particular when delays close to zero are required.

Accordingly, there exists a need in the art for improving spatio-temporal subband transform methods for video enhancement, by introducing a combination of a time wavelet transform and an inverse time wavelet transform that produces a delay d that does not depend upon the maximum time support of multiscale wavelets, and which can potentially be set to zero for causal video enhancement methods.

In addition, many video sources (and in particular medical X-ray video images or defense and security night-vision video images) have a dynamic range that cannot be displayed on the available displays, and applying a sharpening process is useful for increasing the legibility of the video or for making it look nicer. This sharpening process can be applied on a wavelet transform of a video sequence to enhance its local contrast, and there equally exists a need in the art for improving spatio-temporal transform methods for video sharpening or for a combined video enhancement and sharpening with a limited delay d.

SUMMARY OF THE INVENTION

It is a primary object of the invention to devise a method and means of video processing to perform an enhancement of the video comprising noise removal, blur removal or sharpening with a short time delay d, with d a nonnegative integer. In this invention, the video enhancement process comprises a causal spatio-temporal transform, causal non-linear operators and a delay-d spatio-temporal reconstruction. The spatio-temporal transform comprises a causal time wavelet transform performed by filtering in time the video sequence with multiscale causal wavelets, and a spatial subband transform. The delay-d spatio-temporal reconstruction comprises an inverse of the spatial subband transform, and a delay-d inverse of the time wavelet transform. The inverse of the time wavelet transform is implemented using filterings with multiscale reconstruction wavelets that are multiplied with a window function. The window function can either be a function of support [−d,0], or a sum of nonnegative translates of a function of support [−d,0]. This ensures that the delay-d inverse time wavelet transform is an exact inverse of the causal time wavelet transform and that at the same time the processing delay of the whole video processing method is not larger than d. It is possible to choose a delay d=0 or a positive delay. The causal non-linear operators can be thresholding operators applied coefficient by coefficient, or deblurring or enhancement operators, or any combination thereof.

The spatio-temporal transform can be implemented by performing first a causal time wavelet transform and then a spatial subband transform. The spatio-temporal reconstruction transform can be implemented in several ways. In particular, it can be implemented with a sequence of an inverse time wavelet transform followed by an inverse spatial subband transform. Also, the spatio-temporal reconstruction transform can comprise additional spatial nonlinear operators inserted after the inverse time wavelet transform and spatial subband transform. The spatio-temporal reconstruction is thus not called an inverse spatio-temporal transform, as it may incorporate non-linear processing steps so that it is not in all cases a substantial inverse of the spatio-temporal transform.

The spatial subband transform can be chosen among a variety of known transforms, for example a discrete wavelet transform, a dyadic wavelet transform or a bandlet transform.

In an exemplary embodiment, the multiscale causal wavelets used in the causal time wavelet transform comprise boundary wavelets designed for a biorthogonal wavelet transform on [0,+∞), as described by Andersson, Hall, Jawerth and Peters. In yet another exemplary embodiment, the multiscale causal wavelets comprise Haar wavelets.

In an exemplary embodiment of the invention, the multiscale causal wavelets are Haar wavelets, and the causal time wavelet transform is computed recursively for each scale using multiscale averages, and using multiscale weighted differences.

The invention also includes a video scaling apparatus, comprising computer means arranged to carry out a method as disclosed above.

The invention also includes a computer program product, comprising instructions to carry out a method as disclosed above, when the program is run in a computer processing unit.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other objects of this invention, the various features thereof, as well as the invention itself, may be more fully understood from the following description, when read together with the accompanying drawings in which:

FIG. 1 shows, in block diagram form, an exemplary embodiment of the invention which takes as input a video stream and computes an enhanced output video with a small time delay.

FIG. 2 shows, in block diagram form, an exemplary embodiment of a causal spatio-temporal transform.

FIG. 3 shows, in block diagram form, an exemplary embodiment of a delay-d spatio-temporal reconstruction.

FIG. 4 illustrates a delay-d inverse time wavelet transform, when the wavelets used are the Haar wavelets with a delay d=0.

FIG. 5 illustrates a delay-d inverse time wavelet transform, when the wavelets used are the Haar wavelets with a delay d=2.

FIG. 6 illustrates a delay-d inverse time wavelet transform with wavelets of support larger than that of Haar wavelets with a delay d=0.

FIG. 7 illustrates a delay-d inverse time wavelet transform with wavelets of support larger than that of Haar wavelets with a delay d=2.

FIG. 8 illustrates the efficient hierarchical time wavelet transform proposed for the Haar wavelet.

FIG. 9 illustrates the rotation of coefficient registers in an efficient hierarchical time wavelet transform proposed for the Haar wavelet.

FIG. 10 illustrates a chip or chip-set comprising means for carrying out a video enhancement in a video apparatus.

DETAILED DESCRIPTION

Input Video Stream and Enhanced Video Delayed by d Images

FIG. 1 shows a system exemplifying the present invention. It takes as input a digital video stream (101), which is a sequence, indexed by an integer temporal coordinate t, of frames indexed by integer spatial coordinates n=(n₁,n₂). The video pixel values are denoted v[n,t]. The video frames are of size N₁×N₂, so the spatial coordinates n₁ and n₂ lie respectively in [1, N₁] and [1, N₂].

In the present invention, the images of the digital video stream are fed to the system depicted in FIG. 1 in the order of their time indexes t. The system of FIG. 1 processes this digital video stream (101) on line, which means that it processes each image in turn as soon as it is fed into the system, and outputs an enhanced video stream (107) of pixel values ṽ[n,t] with an image delay d: at time t, after a new input image v[n,t] has been fed, the system outputs the enhanced image of time index t−d, of values ṽ[n,t−d], which is an enhanced version of the input image at time t−d, of values v[n,t−d].

Note that in the present invention, the processing is said to have a delay of d frames if it is necessary to have received a frame at time index t to be able to compute an output frame at time index t−d. In practice, implementation constraints impose an additional delay of typically a fraction of a frame. This fraction depends on the spatial span of the entire processing and is not evaluated here. One thus has to expect

System Overview

In FIG. 1, the causal spatio-temporal transform (102) takes as input a digital video stream (101) of samples v[n,t] to perform a spatial and temporal decorrelation of the input video and compute spatio-temporal subband images (103) of coefficients c_(l,j,k)[m,t]. The spatio-temporal transform is causal, which means that to compute a coefficient c_(l,j,k)[m,t], only samples v[n,t′] with t′≤t are needed. In an exemplary embodiment, the causal spatio-temporal transform (102) is implemented with a spatial subband transform, followed by a causal time wavelet transform. A preferred embodiment is illustrated in FIG. 2, wherein the causal spatio-temporal transform (102) is implemented with a causal time wavelet transform (201) followed by a spatial subband transform (202). The indexes l, m denote spatial transform indexes, whereas j, k, t denote temporal transform indexes. Typically, l is a scale and orientation index (carrying information on the scale and subband number), and m is a spatial position index. j is a temporal scale index, and k is a temporal filter shape index.

The spatio-temporal subband images (103) are then processed with a causal spatio-temporal non-linear processor (104) to compute modified spatio-temporal subband images (105) of coefficients c̃_(l,j,k)[m,t]. In an exemplary embodiment, the processor (104) includes thresholding operators to remove noise.

Then, the delay-d spatio-temporal reconstruction (106) takes as input the modified spatio-temporal subband images (105) and computes an enhanced video stream (107) of samples ṽ[n,t]. In an exemplary embodiment, the delay-d spatio-temporal reconstruction (106) is implemented with a delay-d inverse time wavelet transform, followed by an inverse spatial subband transform. In another exemplary embodiment, the delay-d spatio-temporal reconstruction (106) is implemented with an inverse spatial subband transform, followed by a delay-d inverse time wavelet transform. In either case, the delay-d spatio-temporal reconstruction (106) is a substantial inverse of the causal spatio-temporal transform (102). A preferred embodiment is illustrated in FIG. 3: the delay-d spatio-temporal reconstruction (106) is implemented by a delay-d inverse time wavelet transform (301), followed by a spatial nonlinear processor (302) composed of non-linear operators, followed by an inverse spatial subband transform (303). In this last embodiment, the delay-d spatio-temporal reconstruction (106) is not a substantial inverse of the spatio-temporal transform, because it additionally incorporates nonlinear operators in (302) to enhance the video stream, which are intertwined with the temporal and spatial inverse transforms (301) and (303). In yet another exemplary embodiment, the spatial nonlinear processor (302) includes thresholding operators to remove noise, or amplification operators for image sharpening, or a combination thereof.

Within the scope of this invention, the temporal transform (201) and its inverse (301) are performed using filterings along the time axis. Also, the spatial transform (202) and its inverse (303) are performed using filterings along the spatial axes, independently on each image. These operators commute, and thus in any chained combination of these operators, the order in which they are implemented can be changed without changing the output of said chained combination.
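The commutation of temporal and per-frame spatial filterings can be illustrated with a minimal sketch. The function names, the two small filters and the zero extension outside the video are assumptions made for the example only; they are not taken from the described embodiments.

```python
def temporal_filter(video, h):
    """Causal filtering along t: out[t] = sum_u h[u] * video[t-u] (zero before t=0)."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    return [[[sum(h[u] * video[t - u][y][x] for u in range(len(h)) if t - u >= 0)
              for x in range(W)] for y in range(H)] for t in range(T)]


def spatial_filter(video, g):
    """Per-frame horizontal filtering: out[y][x] = sum_u g[u] * frame[y][x-u]."""
    return [[[sum(g[u] * row[x - u] for u in range(len(g)) if x - u >= 0)
              for x in range(len(row))] for row in frame] for frame in video]


def max_diff(a, b):
    """Largest absolute difference between two videos of identical shape."""
    return max(abs(pa - pb)
               for fa, fb in zip(a, b)
               for ra, rb in zip(fa, fb)
               for pa, pb in zip(ra, rb))


# Hypothetical 4-frame, 3x5 test video and two arbitrary filters.
video = [[[float((7 * t + 3 * y + 5 * x) % 11) for x in range(5)]
          for y in range(3)] for t in range(4)]
h_time = [0.5, 0.3, 0.2]    # temporal filter taps (illustrative)
g_space = [0.25, 0.5, 0.25]  # spatial filter taps (illustrative)

time_then_space = spatial_filter(temporal_filter(video, h_time), g_space)
space_then_time = temporal_filter(spatial_filter(video, g_space), h_time)
```

Both orders agree up to floating-point rounding because each filter acts linearly along a different axis.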

The particular structure of the spatial transforms (202) and (303) as well as that of the temporal transforms (201) and (301) make it possible to describe the temporal transforms as operating on one-dimensional signals that are temporal threads of pixels or coefficients, and the spatial transforms as operating on single images.

For the sake of clarity, the present invention is mostly described with exemplary embodiments using real pixel values (i.e. with a single channel), which is for example the case of grayscale video images. It is however apparent to those skilled in the art that the present invention can be applied to multichannel pixel values like color video images with a variety of standard techniques. For example, a color digital video stream can be considered as a triple of grayscale video streams which can each be processed separately by an instance of the present invention.

The Causal Time Wavelet Transform and its Delay-d Inverse

In FIG. 2, the causal time wavelet transform (201) takes as input digital video images from the digital video stream (101) and outputs temporal subband images of coefficients d_(j,k)[n,t] which are the result of a filtering of the input video images with a multiscale causal wavelet family.

The delay-d inverse time wavelet transform (301) is essentially an inverse of the time wavelet transform (201); it takes as input spatio-temporal subband images of coefficients c̃_(l,j,k)[m,t] and computes spatial subband images w_(l)[m,t].

As explained in the above section, the time wavelet transform and its inverse operate independently of the spatial coordinate n or m, so this coordinate will be omitted in the description below.

The filtering of the causal time wavelet transform is performed with a state of the art filtering algorithm with a suitable multiscale causal wavelet family. This family is essentially defined by a scale parameter 2^(J) and a family of discrete causal multiscale wavelets {ψ_(j,k)} where j is an integer between 1 and J corresponding to the scale 2^(j). The integer k allows a plurality of filters per scale 2^(j), with k∈[k_(j),K_(j)].

This family is chosen in such a way that there exists a corresponding reconstruction wavelet family {ψ̂_(j,k)} that satisfies, for all d in a predefined interval [0,D], a reconstruction property

$\sum_{j=1}^{J} \sum_{k=k_j}^{K_j} \psi_{j,k}(t+d)\, \hat{\psi}_{j,k}(-d) = \delta(t) \quad \text{for all } t \qquad (1)$

where δ is the Dirac filter, having only one nonzero coefficient equal to 1 at 0.

In a preferred embodiment, the multiscale causal wavelet family is obtained from a biorthogonal wavelet basis of the half-line interval [0,+∞) such as the one constructed by Andersson, Hall, Jawerth and Peters (“Wavelets on Closed Subsets of the Real Line”, In Recent Advances in Wavelet Analysis, L. L. Schumaker, G. Webb (eds.), 1994). The family of discrete causal multiscale filters {ψ_(j,k)} are the analysis wavelets supported on the half line [0,+∞), while the reconstruction wavelets ψ̂_(j,k) are defined by reversing the dual wavelets ψ̃_(j,k) supported on [0,+∞), i.e. using the formula:

ψ̂_(j,k)[n] = ψ̃_(j,k)[−n]

The subset of indexes j,k can be chosen as follows: j is in the interval [1,J], and for each j, only the indexes k for which the support of ψ̃_(j,k) intersects the interval [0,D] are used. To simplify notation, the scaling functions and dual scaling functions used in a wavelet transform on the half-line, usually denoted φ_(J,k) and φ̃_(J,k), are denoted here ψ_(J,−1−k) and ψ̃_(J,−1−k). With this notation, nonnegative k indexes denote wavelets and negative k indexes denote scaling functions. Again for simplicity, we assume that the subset of indexes k for which the support of ψ̃_(j,k) intersects the interval [0,D] is an interval [k_(j),K_(j)]. Usually, for k larger than some k₀, all wavelets ψ_(j,k) have the same shape, i.e. ψ_(j,k)[t]=ψ_(j,k₀)[t−2^(j)(k−k₀)], which significantly reduces the amount of computation required in the causal time wavelet transform and its delay-d inverse. On the other hand, wavelets ψ_(j,k) for k<k₀ have a support which includes the boundary point {0}. They all have different shapes, and are named “boundary wavelets”.

In another exemplary embodiment, the ψ_(j,k) wavelets are multiwaveletswherein j is a scale index and k is a wavelet shape index.

The causal time wavelet transform d_(j,k)[t] of a signal s[t] is obtained by filtering s with the multiscale causal wavelet family {ψ_(j,k)}, resulting in a time wavelet transform signal d_(j,k)[t]:

d _(j,k) [t]=s*ψ _(j,k) [t]

where * is the convolution operator. As the filters are causal, the computation of all d_(j,k)[t] is possible once s[t] has been received as input. Note that when using biorthogonal wavelets on the half-line as designed by Andersson, Hall, Jawerth and Peters, this is an unusual way of computing a wavelet transform, because usually these wavelets are used to compute scalar products and not convolutions. It is however apparent to those skilled in the art that if the wavelets are chosen accordingly, the convolutions can be computed in an efficient way with filter bank algorithms.

The delay-d inverse time wavelet transform reconstructs a signal s[t] from the set of coefficients d_(j,k)[t] as a sum of filtered time wavelet transform signals:
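The causal filtering d_(j,k)[t] = s*ψ_(j,k)[t] can be sketched as a direct convolution. This is a minimal illustration, not the efficient filter bank implementation mentioned above; the function name, the zero extension of s before t=0, and the two scale-1 Haar filters used as an example are assumptions.

```python
import math


def causal_filter(s, psi):
    """d[t] = sum_{u>=0} psi[u] * s[t-u]; only samples s[t'] with t' <= t are read."""
    return [sum(psi[u] * s[t - u] for u in range(len(psi)) if t - u >= 0)
            for t in range(len(s))]


# Illustrative causal filters: Haar wavelet and scaling filter at scale 2**1.
r = 1.0 / math.sqrt(2.0)
psi_wavelet = [r, -r]
psi_scaling = [r, r]

s = [1.0, 2.0, 3.0, 4.0]
d_wavelet = causal_filter(s, psi_wavelet)
d_scaling = causal_filter(s, psi_scaling)

# Causality check: changing the last sample leaves all earlier outputs unchanged.
s_changed = s[:-1] + [100.0]
d_changed = causal_filter(s_changed, psi_wavelet)
```

Because the filter taps have nonnegative indexes only, each output d[t] is available as soon as s[t] has been received.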

$\begin{matrix}{{s\lbrack t\rbrack} = {\sum\limits_{j = 1}^{J}{\sum\limits_{k = k_{j}}^{K_{j}}{d_{j,k}*{{\hat{\psi}}_{j,k}^{d}\lbrack t\rbrack}}}}} & (2)\end{matrix}$

where the filters {circumflex over (ψ)}_(j,k) ^(d) of the reconstructionfilter family are defined by

{circumflex over (ψ)}_(j,k) ^(d)={circumflex over (ψ)}_(j,k) ×W

with W a windowing function supported in [−d,0] and such that

${\sum\limits_{t = {- d}}^{0}{W\lbrack t\rbrack}} = 1.$

The “×” symbol denotes coefficient-wise multiplication of signals or filters. The reconstruction filters are anti-causal and supported in [−d,0], so the delay induced by the reconstruction is d. Some filters ψ̂_(j,k)^(d) are zero filters and can be removed from the reconstruction computation. Note that for large values of d, the delay-d inverse time wavelet transform might be equivalent to a state of the art dyadic wavelet reconstruction transform. However, for values of d typically smaller than the largest support of the dual wavelets, the present transform is substantially different from any state of the art transformation.

In a preferred embodiment, the window function W is defined as W[t]=1_([−d,0])[t]/(d+1), where 1_(X)[t]=1 if t∈X and 0 otherwise.
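The reconstruction property (1) and the delay-d reconstruction (2) with the uniform window can be checked numerically. The sketch below assumes the causal Haar family with ψ̂_(j,k)[t]=ψ_(j,k)[−t] and a signal extended by zero before t=0; all function names are hypothetical and the direct summations are chosen for readability, not efficiency.

```python
def haar_family(J):
    """Causal Haar filters on [0, 2**J): wavelets (j, k) with k >= 0, plus scaling filter (J, -1)."""
    N = 2 ** J
    fam = {(J, -1): [2.0 ** (-J / 2)] * N}
    for j in range(1, J + 1):
        a, half = 2.0 ** (-j / 2), 2 ** (j - 1)
        for k in range(2 ** (J - j)):
            psi = [0.0] * N
            for n in range(k * 2 ** j, k * 2 ** j + half):
                psi[n] = a          # first half of the support
                psi[n + half] = -a  # second half of the support
            fam[(j, k)] = psi
    return fam


def property_1(fam, t, d):
    """Left-hand side of (1), using hat_psi[-d] = psi[d]."""
    N = 2 ** max(j for j, _ in fam)
    return sum(psi[t + d] * psi[d] for psi in fam.values() if 0 <= t + d < N)


def analysis(s, fam):
    """Causal transform d_{j,k}[t] = sum_u psi_{j,k}[u] * s[t-u]."""
    return {key: [sum(psi[u] * s[t - u] for u in range(min(t, len(psi) - 1) + 1))
                  for t in range(len(s))] for key, psi in fam.items()}


def reconstruct(coeffs, fam, d, t):
    """s[t] from coefficients at times t..t+d, with W = 1/(d+1) on [-d, 0]."""
    w = 1.0 / (d + 1)
    return sum(w * fam[key][e] * c[t + e]
               for key, c in coeffs.items() for e in range(d + 1))


fam = haar_family(2)                       # J = 2, so delays d < 4 are admissible
s = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
coeffs = analysis(s, fam)
errors = {d: max(abs(reconstruct(coeffs, fam, d, t) - s[t])
                 for t in range(len(s) - d)) for d in range(4)}
```

In this sketch, `reconstruct` reads coefficients up to time t+d only, which is what makes the delay equal to d.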

FIG. 4 illustrates an exemplary embodiment of the inverse delay-0 time wavelet transform and the wavelet coefficients involved in the computation of s[t] in the case when the wavelets are Haar wavelets and J=4. Each dot like (401) or (402) represents a wavelet coefficient d_(j,k)[t]. As is customary for wavelet transforms, the horizontal coordinate of each dot represents its substantial temporal location t−k2^(j) and its vertical coordinate its scale j. As a special case, the dots representing coefficients of the largest scale d_(J,k)[t] like (401) also represent a corresponding scaling function coefficient d_(J,−1−k)[t]. The horizontal line (403) represents the time line of t instants (one tick per t). The coefficients actually used in the computation of s[t] are black. All others are white. Due to the small support of the Haar wavelet, only one wavelet coefficient per scale j is needed in the computation of s[t] in addition to one scaling function coefficient.

FIG. 5 illustrates an exemplary embodiment of the inverse delay-d time wavelet transform and the wavelet coefficients involved in the computation of s[t−2], performed at time t, when the delay d=2. The samples and coefficients that can be computed at time t are displayed, and FIG. 5 shows how the reconstructed sample at time t−2 is computed, thus achieving a delay of 2. The same conventions as in FIG. 4 apply. However, coefficients d_(j,k)[t−2] are represented with a small additional offset to the top to distinguish them from coefficients d_(j,k′)[t]. The same is done for coefficients d_(j,k)[t−1] for uniformity. As an example, the dots (501), (502) and (503) represent respectively wavelet coefficients d_(4,1)[t], d_(4,1)[t−1] and d_(4,1)[t−2] together with scaling function coefficients d_(4,−1−1)[t], d_(4,−1−1)[t−1] and d_(4,−1−1)[t−2]; (504), (505) and (506) represent the wavelet coefficients d_(3,3)[t], d_(3,3)[t−1] and d_(3,3)[t−2]. Again the coefficients actually used in the computation of s[t−2] are shown in black. Three or four wavelet coefficients per scale are used for reconstructing s[t−2].

FIG. 6 illustrates an exemplary embodiment of the inverse delay-0 time wavelet transform in a case similar to that of FIG. 4, except that the wavelets used have a larger support than the Haar wavelet, so that the number of coefficients used for each scale is larger than 1.

In the same spirit, FIG. 7 illustrates an exemplary embodiment of the inverse time wavelet transform in a case similar to that of FIG. 5 except that the wavelets used have again a larger support than the Haar wavelet.
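As a worked illustration of the delay-0 Haar case, and assuming the normalization 2^(−j/2) of the causal Haar family, reconstruction wavelets ψ̂_(j,k)[t]=ψ_(j,k)[−t], and the window W=δ obtained for d=0, formula (2) reduces to one term per scale plus one scaling term:

```latex
\hat{\psi}^{0}_{j,k}[t] = \psi_{j,k}[0]\,\delta[t],
\qquad
\psi_{j,k}[0] =
\begin{cases}
2^{-j/2} & \text{if } k = 0 \text{ or } (j,k) = (J,-1) \\
0 & \text{otherwise}
\end{cases}

s[t] = \sum_{j=1}^{J} 2^{-j/2}\, d_{j,0}[t] + 2^{-J/2}\, d_{J,-1}[t]

% For J = 4:
s[t] = 2^{-1/2} d_{1,0}[t] + 2^{-1} d_{2,0}[t] + 2^{-3/2} d_{3,0}[t]
     + 2^{-2} d_{4,0}[t] + 2^{-2} d_{4,-1}[t]
```

All other filters ψ̂_(j,k)^(0) vanish under these assumptions, which is consistent with the single black coefficient per scale described for FIG. 4.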

In a preferred embodiment, if some filters are identical up to a shift, this redundancy is used to reduce the amount of computation required by the method. For instance, suppose that for each scale index j there is an index k_(j)⁰ such that for any k≥k_(j)⁰, the wavelets ψ_(j,k) have the same shape, and the dual wavelets ψ̃_(j,k) have the same shape, i.e.:

ψ_(j,k)[n] = ψ_(j,k_(j)⁰)[n−2^(j)(k−k_(j)⁰)]

ψ̃_(j,k)[n] = ψ̃_(j,k_(j)⁰)[n−2^(j)(k−k_(j)⁰)]

and then also d_(j,k)[n] = d_(j,k_(j)⁰)[n−2^(j)(k−k_(j)⁰)].

The computational cost of the reconstruction is reduced by replacing the formula (2) with:

$s[t] = \sum_{j=1}^{J} \left( \sum_{k=k_j}^{k_j^0 - 1} d_{j,k} * \hat{\psi}_{j,k}^{d}[t] + d_{j,k_j^0} * \hat{\Psi}_{j,k_j^0}^{d}[t] \right)$

where

$\hat{\Psi}_{j,k_j^0}^{d}[t] = \hat{\psi}_{j,k_j^0}[t] \times \sum_{l=0}^{K_j - k_j^0} W[t - 2^j l].$

The window

$\sum_{l=0}^{K_j - k_j^0} W[t - 2^j l]$

appearing in this formula is a sum of nonnegative translates of the original window W. A translate of a discrete window function W[t] is defined as a function W[t−τ] with τ an integer. The translation is said to be nonnegative when τ≥0. In the above-mentioned embodiment, each τ is of the form l×2^(j).
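The summed window can be evaluated directly from its definition. The following sketch assumes the uniform window W=1_([−d,0])/(d+1); the names `uniform_window`, `summed_window` and the parameter `L` (standing for K_(j)−k_(j)⁰) are hypothetical.

```python
def uniform_window(d):
    """Return W with W[t] = 1/(d+1) for -d <= t <= 0 and 0 elsewhere."""
    return lambda t: 1.0 / (d + 1) if -d <= t <= 0 else 0.0


def summed_window(W, j, L, t):
    """Sum of nonnegative translates: sum_{l=0}^{L} W[t - 2**j * l]."""
    return sum(W(t - (2 ** j) * l) for l in range(L + 1))


# Illustrative values: d = 2, scale j = 1, three translates (tau = 0, 2, 4).
W = uniform_window(2)
values = {t: summed_window(W, 1, 2, t) for t in range(-3, 6)}
```

With these illustrative values the translated supports [−2,0], [0,2] and [2,4] overlap at t=0 and t=2, so the summed window takes the value 2/3 there and 1/3 at the other points of [−2,4].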

The Spatial Subband Transform and its Inverse

The spatial subband transform (202) and its inverse (303) are chosen among the large choice of linear invertible spatial subband transforms of 2D images, a description of which can be found in “A Wavelet Tour of Signal Processing” by Stéphane Mallat, Academic Press, 1999, ISBN 0-12-466606-X. Both transforms are applied on a video signal or on a set of subband transform coefficients frame by frame, so they are both causal.

In an embodiment, the subband transform used is an orthogonal or biorthogonal 2-dimensional wavelet transform, which is a tool well known to those skilled in the art. This wavelet transform is obtained with a combination of filtering and subsampling steps. The corresponding inverse wavelet transform is a combination of oversampling and filtering steps.

In yet another embodiment, the subband transform is an orthogonal or biorthogonal wavelet packet transform. It is apparent to those skilled in the art that a large number of variations is possible, including multiwavelet subband transforms and boundary wavelets, and that these variations can be applied to the present invention without departing from its scope.

In yet another embodiment, the subband transform is made redundant by essentially removing some or all of the subsampling operators and upsampling the filters in the spatial subband transform and the reconstruction filters in the inverse subband transform accordingly. This is the “à trous” algorithm of the dyadic wavelet transform or the dyadic wavelet packet transform.

In yet another embodiment, the subband transform is a bandlet transform, as described in “Sparse Geometric Image Representation with Bandlets”, Erwan Le Pennec, Stéphane Mallat, IEEE Trans. on Image Proc. vol. 14, no. 4, pp. 423-438, April 2005.

These spatial subband transforms take as input an image i[n] of size N₁×N₂ and output, through a combination of filtering and subsampling operations, a set of spatial subband coefficients w_(l)[m] indexed by a scale/orientation index l and a position index m. The index l is related to the sequence of filtering and subsampling steps that have been used to compute said spatial subband coefficient. In the case of a biorthogonal wavelet transform, l carries the scale information j and a subband number o=1, …, 3.

The inverse spatial subband transform takes as input a set of spatial subband coefficients w_(l)[m] and recovers the image i[n] with various combinations of oversampling and filtering operations.

Non Linear Processors

The causal spatio-temporal non-linear processor (104) and the spatial non-linear processor (302) modify respectively the spatio-temporal subband images (103) of coefficients c_(l,j,k)[m,t] and the spatial subband images of coefficients w_(l)[m,t] output by the delay-d inverse time wavelet transform (301). These non-linear processors can be any state of the art coefficient-based noise removal, blur removal or sharpening methods, or any combination thereof. Such methods include, but are not limited to, a combination of a thresholding operator for the noise removal and a sharpening for the blur removal or local contrast enhancement.

In an exemplary embodiment of the causal spatio-temporal nonlinear processor (104), a noise removal method is implemented with a hard thresholding operator with a threshold T, specified by the user or estimated by any state of the art method, that computes c̃_(l,j,k)[m,t] from c_(l,j,k)[m,t] according to

$\tilde{c}_{l,j,k}[m,t] = \begin{cases} 0 & \text{if } |c_{l,j,k}[m,t]| \leq T \\ c_{l,j,k}[m,t] & \text{otherwise.} \end{cases}$

Typically T is chosen as 3σ, where σ is an estimate of the standard deviation of the noise present in the signal.
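The hard thresholding rule with T=3σ can be sketched as follows. The function name and the sample values are illustrative assumptions; how σ is estimated is outside the scope of the sketch.

```python
def hard_threshold(c, T):
    """Set a coefficient to zero when |c| <= T, keep it unchanged otherwise."""
    return 0.0 if abs(c) <= T else c


sigma = 1.5            # assumed noise standard deviation (illustrative value)
T = 3.0 * sigma        # threshold T = 3 * sigma
coefficients = [0.2, -4.5, 4.6, -10.0, 3.0]
denoised = [hard_threshold(c, T) for c in coefficients]
```

Coefficients with magnitude at most 4.5 are set to zero here, while larger ones pass through unchanged.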

In another embodiment, the hard thresholding operator is replaced with a more general thresholding operator ρ_(T) indexed by a threshold T, and the nonlinear processor computes

c̃_(l,j,k)[m,t] = ρ_(T)(c_(l,j,k)[m,t]).

In this invention, the thresholding operator can be any state of the art hard or soft thresholding operator.

In another embodiment, the value of c̃_(l,j,k)[m,t] depends not only on the value of c_(l,j,k)[m,t] but also on the values in a spatio-temporal neighborhood. Any state of the art neighborhood-based method can be used in this invention, provided that the neighborhoods used are causal.
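One common choice for ρ_(T) is the soft thresholding operator, which shrinks every coefficient toward zero by T. The sketch below uses the usual textbook definition sign(c)·max(|c|−T, 0); this specific form is an assumption, since the text does not fix a formula for the soft operator.

```python
import math


def soft_threshold(c, T):
    """Usual soft thresholding: sign(c) * max(|c| - T, 0)."""
    return math.copysign(max(abs(c) - T, 0.0), c)


T = 3.0
shrunk = [soft_threshold(c, T) for c in [5.0, -5.0, 1.0, -3.0]]
```

Unlike hard thresholding, this operator is continuous in c, at the price of reducing the amplitude of the coefficients it keeps.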

In an exemplary embodiment, the spatial non-linear processor (302) is a sharpening operator using the following non-linear processing on each coefficient. An amplification parameter α, typically larger than 1, and an attenuation parameter β, typically smaller than 1, are chosen, and the enhanced spatial subband coefficient is computed from the values of the spatial subband coefficient w_(l)[m,t] and its parent in the subband transform w_(l′)[m′,t] according to:

$\tilde{w}_{l}[m,t] = \begin{cases} w_{l}[m,t] & \text{if } |w_{l}[m,t]| > \beta\, |w_{l'}[m',t]| \\ \alpha\, w_{l}[m,t] & \text{otherwise.} \end{cases}$

In the case of orthogonal or biorthogonal wavelets, the indexes l′, m′ of the parent coefficient of w_(l)[m,t] are defined, if l=(j,n), as l′=(j+1,n) and m′=⌊m/2⌋.

Either processor can be a combination of a thresholding operator and a sharpening operator. Furthermore, the parameters may vary depending on the scale/direction index l without departing from the spirit of the invention. These parameters can also be modified locally with the help of an external map specifying a region of interest in the video or a more general segmentation. Also, when processing color video images, each channel can be processed with different nonlinear operators. In addition, the parameters used to process a given channel may depend on the actual value of coefficients in a different channel without departing from the spirit of the invention either.
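The sharpening rule and the parent lookup m′=⌊m/2⌋ can be sketched for one pair of subbands at a fixed time t. The function name, the nested-list layout and the sample values are assumptions; the floor is applied to each spatial component of m.

```python
def sharpen(child, parent, alpha, beta):
    """Keep w when |w| > beta * |parent|, otherwise multiply it by alpha.

    child  : 2D list of coefficients w_l[m, t] at scale j
    parent : 2D list of coefficients w_l'[m', t] at scale j+1, half the size
    """
    out = []
    for m1, row in enumerate(child):
        out_row = []
        for m2, w in enumerate(row):
            p = parent[m1 // 2][m2 // 2]   # parent position m' = floor(m / 2)
            out_row.append(w if abs(w) > beta * abs(p) else alpha * w)
        out.append(out_row)
    return out


# Illustrative 2x2 child subband sharing a single parent coefficient.
child = [[4.0, 1.0], [0.5, -3.0]]
parent = [[4.0]]
sharpened = sharpen(child, parent, alpha=2.0, beta=0.5)
```

In this example the comparison level is β·|parent| = 2.0, so the coefficients 1.0 and 0.5 are amplified by α while 4.0 and −3.0 are left unchanged.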

In general, the spatio-temporal nonlinear processor (104) is causal, which means that the nonlinear processing is performed on the spatio-temporal subband image coefficients independently on each coefficient, or using values in a spatio-temporal neighborhood of each coefficient, provided that the neighborhood is causal. It is apparent to those skilled in the art that it is also possible to devise methods with non causal neighborhoods introducing an additional delay d′ in a then non-causal spatio-temporal non-linear processor (104), and that the resulting delay of the entire video enhancement method depicted in FIG. 1 is then d+d′ instead of simply d. Such embodiments equally lie within the scope of the present invention.

Time Wavelet Transform with Haar Wavelets and Multiscale Averaging

In a preferred embodiment, the wavelet used in the causal time wavelet transform (201) and its delay d inverse (301) is the Haar wavelet. This choice allows an efficient process with a small number of operations per pixel that requires a moderate amount of image buffer or image “shift registers”.

The causal Haar wavelet family with scale J is the set of functions {ψ_(j,k): 1≦j≦J and 0≦k<2^(J−j)}∪{ψ_(J,−1)} where for k≧0

${\psi_{j,k}\lbrack n\rbrack} = \left\{ {{{\begin{matrix}\frac{1}{2^{j/2}} & {{{if}\mspace{14mu} k} \leq {n\; 2^{- j}} < {k + {1/2}}} \\{- \frac{1}{2^{j/2}}} & {{{{if}\mspace{14mu} k} + {1/2}} \leq {n\; 2^{- j}} < {k + 1}} \\0 & {else}\end{matrix}{and}\mspace{14mu} {for}\mspace{14mu} k} \geq {0{\psi_{j,{{- 1} - k}}\lbrack n\rbrack}}} = \left\{ \begin{matrix}\frac{1}{2^{j/2}} & {{{if}\mspace{14mu} k} \leq {n\; 2^{- j}} < {k + 1}} \\0 & {{else}.}\end{matrix} \right.} \right.$

With {circumflex over (ψ)}_(j,k)[t]=ψ_(j,k)[−t], the reconstruction property (1) is satisfied for any delay d<2^(J). The reconstruction filter family {{circumflex over (ψ)}_(j,k) ^(d)} is then defined for a given delay d and the window

$W = \frac{1_{[-d,0]}}{d + 1} \quad \text{by}$

${{\hat{\psi}}_{j,k}^{d}\lbrack t\rbrack} = \left\{ \begin{matrix}{\frac{1}{d + 1}{\psi_{j,k}\left\lbrack {- t} \right\rbrack}} & {{{if}\mspace{14mu} - d} \leq t \leq 0} \\0 & {{otherwise}.}\end{matrix} \right.$

The causal multiscale wavelet family is redundant as ψ_(j,k)[t]=ψ_(j,0)[t−k2^(j)] for j≦J and k≧0. In a preferred implementation, the causal multiscale wavelet family is reduced to {ψ_(j,0)}_(j=1, . . . , J)∪{ψ_(J,−1)} while the reconstruction filter family is reduced to {{circumflex over (ψ)}_(j,0) ^(d)}_(j=1, . . . , J)∪{{circumflex over (ψ)}_(J,−1) ^(d)} where

${{\hat{\psi}}_{j,k}^{d}\lbrack t\rbrack} = {{{\hat{\psi}}_{j,k}\lbrack t\rbrack} \times {\sum\limits_{l = 0}^{{div}{({d,2^{j}})}}{W\left\lbrack {t - {l\; 2^{j}}} \right\rbrack}}}$

with div the Euclidean division operator, i.e. div(a,b)=└a/b┘.

As the {circumflex over (ψ)}_(j,k) are supported in [−2^(j)+1,0], {circumflex over (ψ)}_(j,k) ^(d) can be expressed as

ψ̂_(j, k)^(d)[t] = ψ̂_(j, k)[t] × γ(t, d, j) with${\gamma \left( {t,d,j} \right)} = \left\{ \begin{matrix}{\frac{1}{d + 1}\left( {{{div}\left( {d,2^{j}} \right)} + 1_{{- t} < {{rem}{({d,2^{j}})}}}} \right)} & {{{if}\mspace{14mu} - 2^{j}} < t \leq 0} \\0 & {otherwise}\end{matrix} \right.$

with div and rem respectively the Euclidean division and remainder operators, i.e. div(a,b)=└a/b┘ and rem(a,b)=a−b×└a/b┘.
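As a numerical check, the following sketch evaluates the sum of translated windows W[t−l2^(j)] directly and compares it with a closed form built from div and rem. All function names are hypothetical. The indicator is taken as inclusive, that is 1 when −t ≦ rem(d,2^(j)), which is the form for which the two expressions agree; for d=0 it gives γ(0,0,j)=1.

```python
def window(t, d):
    """W[t] = 1/(d+1) on the support [-d, 0], zero elsewhere."""
    return 1.0 / (d + 1) if -d <= t <= 0 else 0.0

def gamma_sum(t, d, j):
    """Sum of the translates W[t - l * 2**j] for l = 0 .. div(d, 2**j)."""
    if not (-2 ** j < t <= 0):
        return 0.0
    return sum(window(t - l * 2 ** j, d) for l in range(d // 2 ** j + 1))

def gamma_closed(t, d, j):
    """Closed form using div and rem, with an inclusive indicator (assumption of this sketch)."""
    if not (-2 ** j < t <= 0):
        return 0.0
    q, r = divmod(d, 2 ** j)
    return (q + (1 if -t <= r else 0)) / (d + 1)
```

For every delay d, the weights γ(t,d,j) summed over −2^(j)<t≦0 add up to 1, since each of the d+1 window samples is counted exactly once.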

In a preferred embodiment, the causal time Haar wavelet transform is implemented in an efficient way with the use of a multiscale averaging transform computed with a hierarchical averaging method. This method reduces the number of images to be read or written as well as the number of operations per pixel in the causal time wavelet transform. It requires a buffer of 2^(J)−2 multiscale average images in which only J images are used at each time t and only one past video image v(n,t−1). This buffer has to be compared with the 2^(J)−1 past images to be buffered and used in the direct convolution implementation. As the time wavelet transform and its inverse operate independently of the spatial coordinate n, this coordinate will be omitted in the description below. The buffer of values that need to be stored is illustrated in FIG. 8 for a causal time Haar wavelet transform which is used in conjunction with a delay-2 inverse time Haar wavelet transform.

In essentially the same way as in FIG. 4-7, each upper half of a dot represents a wavelet coefficient or a register to store a wavelet coefficient. As opposed to more general wavelet transforms, the Haar wavelets all have the same shape, so ψ_(j,k)[t]=ψ_(j,0)[t−2^(j)k] for k≧0, and d_(j,k)[t]=d_(j,0)[t−2^(j)k]. Thus, wavelet coefficients d_(j,k)[t] for different values of t (and non-negative k) need not be shifted vertically (unlike in FIG. 5) for different values of t and identical t−2^(j)k since d_(j,k)[t] only depends on j and t−2^(j)k and is thus equal to d_(j,0)[t−2^(j)k].

To compute the Haar wavelet coefficients d_(j,0)[t] and d_(J,−1)[t], a family of intermediate multiscale averages a_(j)[t] is introduced. They correspond to the scaling function coefficients of the Haar wavelet construction and are defined by

${a_{j}\lbrack t\rbrack} = {\left( {s*\left( {\frac{1}{2^{j/2}}1_{\lbrack{0,{2^{j} - 1}}\rbrack}} \right)} \right).}$

These multiscale average coefficients are represented by the lower half of the dots in FIG. 8. In this figure, the dots thus represent both a time wavelet coefficient and a multiscale average coefficient. Whenever the upper half of a dot is black, a wavelet coefficient d_(j,0)[t−2^(j)k] is stored in the corresponding register. Whenever the lower half of a dot is black, a multiscale average coefficient a_(j)[t−2^(j)k] is stored in the corresponding register. A fully black dot means that the corresponding register contains both a wavelet coefficient d_(j,0)[t−2^(j)k] and the corresponding multiscale average coefficient a_(j)[t−2^(j)k]. Also note that d_(4,−1)[t]=a₄[t].

Each time a new frame is input at time t, all wavelet coefficients d_(j,k)[t] and multiscale average coefficients a_(j)[t] have to be computed using the new sample s[t]. They are computed with a recursive process. Each current time multiscale average is obtained as a weighted average of a current time multiscale average and a past multiscale average read from a buffer, both of the next finer scale, or of a current time input frame and a past input frame read from a buffer:

${a_{1}\lbrack t\rbrack} = \frac{{s\lbrack t\rbrack} + {s\left\lbrack {t - 1} \right\rbrack}}{\sqrt{2}}$and  for  j ∈ [1, J − 1]${a_{j + 1}\lbrack t\rbrack} = {\frac{{a_{j}\lbrack t\rbrack} + {a_{j}\left\lbrack {t - 2^{j}} \right\rbrack}}{\sqrt{2}}.}$

The multiscale wavelet coefficients are then computed as weighted differences of two current time multiscale averages or as a weighted difference of a current time input frame and a multiscale average:

$d_{1,0}[t] = \sqrt{2}\, s[t] - a_{1}[t]$

and for j ∈ [1, J − 1],

$d_{j+1,0}[t] = \sqrt{2}\, a_{j}[t] - a_{j+1}[t].$

The last wavelet coefficient d_(J,−1)[t] is equal to a_(J)[t].
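The recursion can be illustrated with the following sketch, written in batch form over a whole one-dimensional sequence s for clarity, with s[t] taken as zero for t<0. The function name, the helper `past` and the dictionary layout (including the convenience entry a[0]=s) are assumptions of the sketch; a streaming implementation would keep only the buffered past averages described above.

```python
import numpy as np

def haar_time_transform(s, J):
    """Return (a, d) with a[j][t] = a_j[t] and d[j][t] = d_{j,0}[t] for j = 1 .. J."""
    s = np.asarray(s, dtype=float)

    def past(x, lag):
        # x[t - lag], with zeros before the start of the stream.
        out = np.zeros_like(x)
        if lag < len(x):
            out[lag:] = x[:len(x) - lag]
        return out

    a = {0: s}   # a[0] = s is a convenience entry of this sketch, not a defined average
    d = {}
    for j in range(J):
        finer = a[j]
        # Weighted average of the current and the past value at the next finer scale.
        a[j + 1] = (finer + past(finer, 2 ** j)) / np.sqrt(2.0)
        # Weighted difference giving the wavelet coefficient at scale j + 1.
        d[j + 1] = np.sqrt(2.0) * finer - a[j + 1]
    return a, d
```

The coefficients obtained this way coincide with a direct convolution of s with the Haar filters ψ_(j,0), and a[J] gives the last coefficient d_(J,−1).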

The corresponding computation flow is illustrated with dashed arrows in FIG. 8. Once these coefficients are computed, d_(j,0)[t−2] . . . d_(j,0)[t] for j=1 . . . J and d_(J,−1)[t−2] . . . d_(J,−1)[t] are processed with the causal spatio-temporal nonlinear processor (104) and used to compute a reconstruction {tilde over (s)}[t−2].

The register shifting corresponding to incrementing the time variable t is straightforward and illustrated in FIG. 9: the arrows pointing to the left of the figure indicate how values are transferred from one register to another one. Note that when increasing the time index t to t+1, the value of a₂[(t+1)−3]=a₂[t−2] is copied from the register (901) to the register (902), as this value will be reused at time t+2, while the value of d_(2,0)[(t+1)−3]=d_(2,0)[t−2] stored in the register (901) at time t is dropped and not copied since it is not going to be used anymore. Various enhancements can be brought to this, like using rotating shift registers in software or hardware to reduce memory bandwidth requirements, without departing from the spirit and scope of this invention.
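One possible software form of a rotating shift register is sketched below, purely as an illustration. The class name and its methods are hypothetical. Instead of copying every stored value at each time increment, only a head index is advanced, so a single write is performed per new value.

```python
class RotatingRegister:
    """Fixed-size history in which shifting advances an index instead of copying values."""

    def __init__(self, size):
        self._buf = [None] * size
        self._head = 0

    def push(self, value):
        # Advance the head and overwrite the oldest stored value.
        self._head = (self._head + 1) % len(self._buf)
        self._buf[self._head] = value

    def past(self, lag):
        # Value pushed `lag` steps ago (lag = 0 is the most recent one).
        if not 0 <= lag < len(self._buf):
            raise IndexError("lag outside the stored history")
        return self._buf[(self._head - lag) % len(self._buf)]
```

For example, a register of size 2^(j)+1 would suffice to read a_(j)[t−2^(j)] with `past(2 ** j)` after pushing a_(j)[t].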

Furthermore, extensions of this example to different values of the time delay d, the maximum scale J or different wavelet systems are apparent to those skilled in the art, and also within the scope of the present invention.

The present invention may be embodied as software run by a general-purpose microprocessor or digital signal processor, in which case the modules described above with reference to FIG. 1, FIG. 2 and FIG. 3 are understood to be or form part of software modules or routines. It may also be implemented as a hardware component, for example in an ASIC or an FPGA, in which case the above-mentioned modules can form regions of the solid-state component. These embodiments of the present invention can be incorporated in various units such as a set-top box or a high definition television set that receives an interlaced SDTV video and outputs or displays an HDTV video.

FIG. 10 illustrates an exemplary embodiment of the present invention within a hardware device (examples including but not limited to ASIC, FPGA, multi-chip module, system in package or system on chip). The hardware device contains a processing block (1003) for enhancing the video stream according to the present invention, in addition to other video processing blocks (1002) and (1004), before and after the video enhancement block (1003). In an exemplary embodiment, the video processing blocks (1002), (1003) and (1004) are implemented in a single chip (1001). The chip also has video input and output interfaces, and external RAM (random access memory) devices (1005) and (1006) as temporary storage required for the video processing steps performed in (1002), (1003) and (1004). Other variants of this embodiment can be equally considered as part of the invention, with more complete video processing chips, or even system on chip devices including, in addition to a block enhancing the video stream according to the present invention, other blocks like video decoders, analog signal demodulators, on-screen display modules. The hardware device can then be incorporated into a video processing apparatus, a television set, a DVD-player or any other video apparatus.

While a detailed description of exemplary embodiments of the invention has been given above, various alternatives, modifications, and equivalents will be apparent to those skilled in the art. Therefore the above description should not be taken as limiting the scope of the invention which is defined by the appended claims.

1. A method for enhancing a digital video stream of digital images, comprising: (a) transforming the digital images of the video stream into spatio-temporal subband images by processing circuitry of a video apparatus, the transforming comprising a causal time wavelet transform using filterings by multiscale causal wavelets and a spatial subband transform; (b) applying causal non-linear operators to said spatio-temporal subband images by the processing circuitry, to compute modified spatio-temporal subband images; and (c) applying to said modified spatio-temporal subband images a delay-d spatio-temporal reconstruction by the processing circuitry, to compute enhanced video images delayed by a predetermined number d of images, d being a non-negative integer, wherein said delay-d spatio-temporal reconstruction comprises: i. an inverse of said spatial subband transform; and ii. an inverse of said time wavelet transform using filterings with multiscale reconstruction wavelets each multiplied with either a window function of support [−d;0], or with a sum of nonnegative translates of said window function.
2. A video enhancement method according to claim 1 wherein transforming the digital images into spatio-temporal subband images comprises: (a) applying said causal time wavelet transform to said digital images to compute temporal subband images; and (b) applying said spatial subband transform to said temporal subband images to compute said spatio-temporal subband images.
3. A video enhancement method according to claim 1, wherein said delay-d spatio-temporal reconstruction comprises: (a) applying said delay-d inverse time wavelet transform to said modified spatio-temporal subband images to compute spatial subband images; and (b) applying an inverse of said spatial subband transform to said subband images to obtain the enhanced video images.
4. A video enhancement method according to claim 1, wherein said delay-d spatio-temporal reconstruction transform comprises: (a) applying said delay-d inverse time wavelet transform to said modified spatio-temporal subband images to compute spatial subband images; (b) applying spatial non-linear operators to said spatial subband images to compute modified spatial subband images; and (c) applying an inverse of said spatial subband transform to said modified subband images to obtain the enhanced video images.
5. A video enhancement method according to claim 1, wherein said spatial subband transform comprises a wavelet transform.
6. A video enhancement method according to claim 1, wherein said spatial subband transform comprises a bandlet transform.
7. A video enhancement method according to claim 1, wherein said non-linear operators comprise thresholding operators.
8. A video enhancement method according to claim 1, wherein d=0.
9. A video enhancement method according to claim 1, wherein d>0.
10. A video enhancement method according to claim 1, wherein said multiscale causal wavelets comprise boundary wavelets designed for a biorthogonal wavelet transform on [0,+∞), and wherein said reconstruction multiscale wavelets comprise reversed dual boundary wavelets.
11. A video enhancement method according to claim 1, wherein said causal multiscale wavelets comprise Haar wavelets.
12. A video enhancement method according to claim 11, wherein said causal time wavelet transform comprises: (a) computing recursively for each scale 1 to J a multiscale average as an average of a current time multiscale average and a past multiscale average read from a buffer, or of a current time input frame and a past input frame read from a buffer; (b) computing time multiscale wavelet coefficients as weighted differences of two of said current time multiscale averages or as a weighted difference of a current time input frame and a multiscale average.
 13. (canceled)
 14. (canceled)
15. A video apparatus, comprising processing circuitry arranged for: (a) transforming digital images of a video stream into spatio-temporal subband images, the transforming comprising a causal time wavelet transform using filterings by multiscale causal wavelets and a spatial subband transform; (b) applying causal non-linear operators to said spatio-temporal subband images to compute modified spatio-temporal subband images; and (c) applying to said modified spatio-temporal subband images a delay-d spatio-temporal reconstruction to compute enhanced video images delayed by a predetermined number d of images, d being a non-negative integer, wherein said delay-d spatio-temporal reconstruction comprises: i. an inverse of said spatial subband transform; and ii. an inverse of said time wavelet transform using filterings with multiscale reconstruction wavelets each multiplied with either a window function of support [−d;0], or with a sum of nonnegative translates of said window function.
16. A machine-readable medium having stored therein a computer program product, wherein the program product comprises instructions to carry out the following when said program product is run in processing circuitry of a video apparatus: (a) transforming digital images of a video stream into spatio-temporal subband images, the transforming comprising a causal time wavelet transform using filterings by multiscale causal wavelets and a spatial subband transform; (b) applying causal non-linear operators to said spatio-temporal subband images to compute modified spatio-temporal subband images; and (c) applying to said modified spatio-temporal subband images a delay-d spatio-temporal reconstruction to compute enhanced video images delayed by a predetermined number d of images, d being a non-negative integer, wherein said delay-d spatio-temporal reconstruction comprises: i. an inverse of said spatial subband transform; and ii. an inverse of said time wavelet transform using filterings with multiscale reconstruction wavelets each multiplied with either a window function of support [−d;0], or with a sum of nonnegative translates of said window function.
17. A video apparatus according to claim 15, wherein transforming the digital images into spatio-temporal subband images comprises: (a) applying said causal time wavelet transform to said digital images to compute temporal subband images; and (b) applying said spatial subband transform to said temporal subband images to compute said spatio-temporal subband images.
18. A video apparatus according to claim 15, wherein said delay-d spatio-temporal reconstruction comprises (a) applying said delay-d inverse time wavelet transform to said modified spatio-temporal subband images to compute spatial subband images; and (b) applying an inverse of said spatial subband transform to said subband images to obtain the enhanced video images.
19. A video apparatus according to claim 15, wherein said delay-d spatio-temporal reconstruction transform comprises: (a) applying said delay-d inverse time wavelet transform to said modified spatio-temporal subband images to compute spatial subband images; (b) applying spatial non-linear operators to said spatial subband images to compute modified spatial subband images; and (c) applying an inverse of said spatial subband transform to said modified subband images to obtain the enhanced video images.
20. A video apparatus according to claim 15, wherein said causal multiscale wavelets comprise Haar wavelets.
21. A video apparatus according to claim 20, wherein said causal time wavelet transform comprises: (a) computing recursively for each scale 1 to J a multiscale average as an average of a current time multiscale average and a past multiscale average read from a buffer, or of a current time input frame and a past input frame read from a buffer; (b) computing time multiscale wavelet coefficients as weighted differences of two of said current time multiscale averages or as a weighted difference of a current time input frame and a multiscale average.